Fidelity varies in the symbiosis between a gutless marine worm and its microbial consortium

Background Many animals live in intimate associations with a species-rich microbiome. A key factor in maintaining these beneficial associations is fidelity, defined as the stability of associations between hosts and their microbiota over multiple host generations. Fidelity has been well studied in terrestrial hosts, particularly insects, over longer macroevolutionary time. In contrast, little is known about fidelity in marine animals with species-rich microbiomes at short microevolutionary time scales, that is at the level of a single host population. Given that natural selection acts most directly on local populations, studies of microevolutionary partner fidelity are important for revealing the ecological and evolutionary processes that drive intimate beneficial associations within animal species. Results In this study on the obligate symbiosis between the gutless marine annelid Olavius algarvensis and its consortium of seven co-occurring bacterial symbionts, we show that partner fidelity varies across symbiont species from strict to absent over short microevolutionary time. Using a low-coverage sequencing approach that has not yet been applied to microbial community analyses, we analysed the metagenomes of 80 O. algarvensis individuals from the Mediterranean and compared host mitochondrial and symbiont phylogenies based on single-nucleotide polymorphisms across genomes. Fidelity was highest for the two chemoautotrophic, sulphur-oxidizing symbionts that dominated the microbial consortium of all O. algarvensis individuals. In contrast, fidelity was only intermediate to absent in the sulphate-reducing and spirochaetal symbionts with lower abundance. These differences in fidelity are likely driven by both selective and stochastic forces acting on the consistency with which symbionts are vertically transmitted. Conclusions We hypothesize that variable degrees of fidelity are advantageous for O. algarvensis by allowing the faithful transmission of their nutritionally most important symbionts and flexibility in the acquisition of other symbionts that promote ecological plasticity in the acquisition of environmental resources. Video Abstract Supplementary Information The online version contains supplementary material available at 10.1186/s40168-022-01372-2.

). In associations with strict fidelity, genetic variants of hosts and symbionts show phylogenetic concordance. Strict fidelity is favoured in associations in which the symbionts are transmitted vertically, that is directly from hosts to their offspring. However, fidelity in vertically transmitted symbioses can be disrupted by host switching, symbiont displacement, acquisition of novel symbionts from free-living microorganisms and symbiont loss. Fidelity is generally weaker in associations with horizontal symbiont transmission in which the symbionts are acquired from free-living microbial populations or co-occurring hosts [4,5]. However, strong fidelity can also occur in symbioses with horizontal transmission if genotype-dependent partner choice ensures the faithfulness of the association [4].
Partner fidelity has been well studied in obligate associations with only one or a few symbionts, such as aphids [12][13][14], tsetse flies [15], Riftia tube worms [16], Vesicomyidae clams [17][18][19], Solemya clams [20] and the Hawaiian bobtail squid [21]. However, as the number and diversity of symbiont species in a host increase, analysing partner fidelity over multiple host generations and across hundreds to thousands of microbial species and strains that often evolve rapidly and are in continuous flux is highly challenging [22,23]. In hosts with highly diverse microbiota, such as sponges [24][25][26], ascidians [26], corals [26,27], some insects [28][29][30] and mammals [31][32][33], phylosymbiosis is therefore the approach most often used (see Table 1). In phylosymbiosis studies, analyses of microbial metagenomes, microbial marker genes, or parts of them, like the 16S rRNA gene, are used to examine if the composition of a given group of host-associated microbiomes reflects the phylogeny of these hosts [28,29,31,[34][35][36]. However, phylosymbiosis is different from partner fidelity. In phylosymbiosis studies, the similarity of microbial communities across hosts is analysed, while in fidelity studies, the coinheritance of symbiont and host genotypes is investigated (see Table 1). By characterizing patterns of host-symbiont coinheritance, partner fidelity can provide more detailed insights into the ecological and evolutionary processes that affect the acquisition and persistence of each single member of complex symbiont communities.
Previous studies of partner fidelity in associations with complex symbiont communities have focussed on terrestrial animals, such as insects [37][38][39][40][41] and humans Table 1 Definition of key terms used in this study: As many of the terms below are used inconsistently in the literature, we explain here how we interpret them.

Vertical transmission
The direct transmission of a symbiont from a parent to its offspring. In most symbioses, the transmission is from mother to offspring (maternal), but there are cases of paternal transmission [6,7].

Horizontal transmission
The transmission of a symbiont to a host from the environment or a co-occurring host [6].

Mixed-mode transmission
The transmission of a symbiont by vertical transmission mixed with occasional or frequent events of horizontal transmission over evolutionary time [5]. Note that the transmission of a symbiont community from one generation to the next in which some members are transmitted vertically and others horizontally is not meant here when using this term.

Partner fidelity
The stability of the association between host and symbiont genotypes over multiple host generations [8]. Partner fidelity is generated by vertical symbiont transmission or genotype-dependent partner choice in horizontal symbiont transmission [4]. Note that studies on mechanism of how microbiomes are transmitted across 1-2 generations in host individuals (e.g. parents to offspring) are not the same as partner fidelity studies, which examine fidelity across many generations in multiple individuals. In this study, we used congruent phylogenies of host mitochondrial genomes and symbiont genomes at microevolutionary scales as an indicator of partner fidelity.

Partner choice
The ability of hosts, symbionts or both to preferentially choose their partner. Partner choice describes interactions between individual partners within their lifetime and is distinct from partner fidelity requiring repeated interactions over evolutionary time [8,9].

Partner specificity
The taxonomic range of partners in an association [10]. Symbiont specificity is defined as the range of symbionts with which a host associates, while host specificity is defined as the range of hosts with which a symbiont associates. In this study, we distinguish partner specificity from partner fidelity, as the former measures the possible diversity of host-symbiont associations, but not the stability of each association.

Coinheritance
The transmission of two or more traits from a host parent to its offspring. Traits can include any combination of phenotypes, genes, alleles, organelles and symbionts. In this study, we use the term to describe the coinheritance of mitochondria and symbionts from parents to their offspring.

Phylosymbiosis
Microbial community relationships that recapitulate the phylogeny of their hosts [11]. Phylosymbiosis tests how similar the composition of microbial communities is to the phylogeny of host species and can arise through ecological or evolutionary forces. Phylosymbiosis differs from partner fidelity in that the structure of the microbial community is analysed, not the phylogeny of each symbiont taxon.

Microevolution
Evolutionary change in a population over short time scales and generally applied to evolution within a species or conspecific populations.

Macroevolution
Evolutionary change over longer time scales and generally applied to evolution across species and higher taxonomic groups. [42][43][44][45], but the little that is known about partner fidelity in marine animals with a diverse and species-rich microbiome largely stems from studies on sponges and corals (e.g. [46,47] [49,50]. In summary, assumptions about partner fidelity based on macroevolutionary scales may not reflect ecologically relevant, microevolutionary interactions between hosts and their microbial communities. Here we investigated microevolutionary partner fidelity in the marine annelid Olavius algarvensis (Fig. 1a). This gutless marine worm is an excellent model for investigating partner fidelity in hosts with multimember symbiont communities, as it lives in an intimate, beneficial association with six to seven symbiont species. Its microbiota is therefore complex and diverse, yet simple enough for investigating the fidelity of each single symbiont species. All gutless annelid species (Clitellata, Naididae, Phallodrilinae, genera Fig. 1 The O. algarvensis population in two bays off the island of Elba was dominated by two mitochondrial haplotypes. a Light microscopy image of Olavius algarvensis. b Fluorescence in situ hybridization image of an O. algarvensis cross section, highlighting the symbionts just below the cuticle of the host (gammaproteobacterial symbionts in green and deltaproteobacterial symbionts in red, using general probes for these two phyla). Reproduced with permission from Kleiner et al. [67]. c and d Location of the two collection sites, Sant' Andrea and Cavoli, two bays off the island of Elba in the Mediterranean. e Haplotype network of mitochondrial cytochrome c oxidase subunit 1 (COI) gene sequences of O. algarvensis individuals from the two collection sites. The two dominant COI haplotypes A and B co-occurred in both bays. The size of the pie charts corresponds to COI haplotype frequencies. Hatch marks correspond to the number of point mutations between COI haplotypes. Nodes depicted by small red points indicate unobserved intermediates predicted by the algorithm in the haplotype network software. The number of individuals identified as COI haplotype A or B in each bay is in parentheses in the box below the network.
Olavius and Inanidrilus; sensu Erséus et al. [51]) are regularly associated with at least three to seven symbiont species from different genera and phyla that co-occur within single host individuals [52][53][54][55][56][57][58][59]. All symbionts are harboured in an extracellular region immediately under the outer cuticle of the host (Fig. 1b). Over the estimated 50 million years these hosts have evolved from their gut-bearing ancestors [60], they have become so fully dependent on their symbionts for both nutrition and recycling of their waste compounds that they no longer have a mouth, digestive tract and excretory system [52][53][54]61]. Symbiont transmission occurs vertically through smearing when the parents deposit their eggs in the sediment, based on transmission electron microscopy and fluorescence in situ hybridization (FISH) studies of three host species (Inanidrilus leukodermatus, Olavius planus and Olavius algarvensis) [59,[62][63][64]. However, given the morphological similarity of several members of the bacterial symbiont community, these studies could not resolve if all symbionts are inherited through egg smearing or if some are acquired horizontally from the environment. Furthermore, there is evidence for host switching and displacement in the dominant, sulphur-oxidizing symbiont Candidatus Thiosymbion over longer evolutionary time [65,66].
In the best-studied gutless marine annelid, O. algarvensis, seven symbiont species have been identified -two sulphur-oxidizing Gammaproteobacteria: Ca. Thiosymbion and Gamma3; four sulphate-reducing Deltaproteobacteria: Delta1a, Delta1b, Delta3 and Delta4; and a spirochete [52-55, 68, 69]. Of these seven symbionts, not all consistently co-occur in all host individuals, but all hosts harbour the sulphur-oxidizing symbiont Ca. Thiosymbion, the sulphate-reducing Delta1a or Delta1b symbiont and the spirochete [55]. The gammaproteobacterial, and possibly the deltaproteobacterial symbionts, autotrophically fix CO 2 and engage in syntrophic cycling of oxidized and reduced sulphur compounds [52,53,67]. Nutrient transfer to the host occurs via phagolysosomal digestion of the symbionts in the epidermal cells underneath the symbiont layer [70,71]. Evidence from genomic, proteomic and stable isotope analyses indicates resource partitioning, with different symbiont species favouring different energy and carbon sources [53,54,61,70]. This metabolic niche differentiation among these co-occurring symbionts, together with the variability in their abundances across host individuals, indicates different levels of selective pressure, which could be reflected in varying degrees of partner fidelity [72].
In this study, we examined partner fidelity in O. algarvensis collected off the island of Elba, Italy, by analysing single-nucleotide polymorphisms (SNPs) across 80 metagenomes. For the hosts, we analysed the phylogeny of their mitochondrial genomes as indicators of vertical symbiont transmission. We then compared mitochondrial phylogeny with that of each symbiont to identify levels of congruence, and correspondingly fidelity, within the microbial consortium. We used a low-coverage sequencing approach for nonmodel organisms that has not yet been applied to hostmicrobe associations [73,74], allowing the analysis of the entire microbial consortium, including low-abundance symbionts.

Two mitochondrial haplotypes dominated the host population
To identify mitochondrial genetic diversity in the O. algarvensis population, we sequenced the mitochondrial cytochrome c oxidase subunit I (COI) gene of 579 O. algarvensis individuals collected over 4 years from two bays approximately 16 km apart on the island of Elba, Italy (Sant' Andrea and Cavoli; Fig. 1 c and d). A haplotype network, based on 579 COI sequences of 525 bp, revealed two dominant mitochondrial COI haplotypes, here termed A and B, which co-occurred at both locations and were distinct from each other by five nucleotides (Fig. 1e). We sequenced the metagenomes of 20 individuals from the two dominant A and B haplotypes from both locations (in total 80 metagenomes) and assembled complete circular genomes of host mitochondria (mtDNA) from an arbitrarily selected metagenome for A and B haplotypes. The mtDNAs of A and B haplotypes shared 99.3% average nucleotide identity (ANI) and encoded 13 protein-coding genes, 2 ribosomal RNAs and 21 transfer RNAs (A = 15,715 bp and B = 15,730 bp). The close phylogenetic relatedness between these two host mitochondrial haplotypes that co-occurred in both bays allowed us to examine the effects of both genetics and geographic location on partner fidelity in O. algarvensis.

Intraspecific symbiont diversity was low across and within host individuals
We assembled metagenome-assembled genomes (MAGs) for each of the seven symbionts in O. algarvensis and used them as references for all further analyses (Supplementary Table S1; Supplementary Figs. S1 and S2). To ensure that each of the seven symbionts belonged to the same species across all O. algarvensis individuals, we examined average nucleotide identity (ANI) of all MAGs that could be recovered from each symbiont species (Supplementary Fig. S3). MAGs from the same species shared more than 95% pairwise ANI, and their 16S rRNA genes were also more than 99% identical (Supplementary Figs. S3 and S4, the only exception was one Ca. Thiosymbion pair that had a slightly lower ANI of 94.7%). These sequence similarities are widely accepted as thresholds for identifying bacterial species (95% ANI for MAGs [75][76][77] and 98.5 or 99% for 16S rRNA genes [78,79]).
Symbiont strain diversity within single host individuals was low based on SNP densities in the seven symbiont genomes (0.02-0.49 SNP/kbp; Supplementary Table S2b). These values are lower than or comparable to SNP densities reported for endosymbionts transmitted vertically with repeated events of horizontal transmission, such as the shallow water bivalve Solemya velum and the deep-sea scaly-foot snail Chrysomallon squamiferum (0.1-1 SNP/kbp and 0.004-3.5 SNP/kbp, respectively) [20,80], and considerably lower than SNP densities in horizontally transmitted symbionts of the giant tubeworm Riftia pachyptila (2.9 SNP/kbp) [81] and deep-sea Bathymodiolus mussels (5-11 SNP/kbp) [82]. Since strain diversity was low within host individuals, we treated each symbiont within an O. algarvensis individual as a single genotype in the analyses described below.

Sulphur-oxidizing, sulphate-reducing and spirochete symbionts were found in all host individuals
We assessed the relative abundance of symbionts within each of the 80 O. algarvensis individuals by quantifying sequencing read abundances for singlecopy genes specific to each symbiont species ( Fig. 2; Supplementary text 1.1; 162 to 431 single-copy genes per symbiont species). The sulphur-oxidizing symbionts, Ca. Thiosymbion and Gamma3, as well as the spirochete, were present in all individuals (Fig. 2a). All host individuals also always had sulphate-reducing symbionts (Delta1a, Delta1b, Delta3 and Delta4), but these varied across host individuals, and no individuals hosted all of them. Delta3 was detected in only six host individuals, making statistical tests meaningless and therefore excluded from subsequent phylogenetic analyses.
Based on relative read abundances for single-copy genes, Ca. Thiosymbion was the most abundant symbiont across host individuals (41.6 ± 11.6%; mean ± standard deviation; Fig. 2b), while the abundance of Gamma3 symbiont reads varied considerably in host individuals from 0.1 to 61.2% (28.3 ± 11.3%). The summed relative abundances of the sulphur-oxidizing gammaproteobacteria (69.9 ± 7.4%) and sulphate-reducing deltaproteobacteria (26.7 ± 7.2%) showed consistent ratios across the 80 host individuals, regardless of the location, host COI haplotype, or the combination of these two factors (Supplementary Table S3). Read abundances of the spirochete symbiont were consistently low in host individuals (3.4 ± 2.7%), with one exception of 17.3%.

Probabilistic SNP identification increased the number of hosts and SNP sites for phylogenetic reconstruction of low-abundance symbionts
Conventionally, SNPs in symbiont genomes are identified using a deterministic genotype-calling approach that requires a minimum read coverage (e.g. at least fivefold [83]; Supplementary text 1.4). In our dataset, this coverage requirement excluded symbionts that occurred at low relative abundances (Supplementary Table 4a). In addition, the deterministic genotype calling limits the number of available SNP sites for phylogenetic inference, in our case to a maximum symbiont SNP density of 0.102 SNP/ kbp (ranging from 0.006 to 0.102 SNP/kbp; Supplementary Table 4b). To circumvent these limitations, we calculated genotype probabilities and inferred genetic distances from the probability matrices for phylogenetic reconstruction. This allowed us to reconstruct phylogenies of symbionts from more O. algarvensis individuals, up to twice as many individuals than using the deterministic approach (between 3 and 131% increase; Supplementary Table S4a). Moreover, the probabilistic approach increased the robustness of our phylogenetic analyses, as it led to a substantial increase in SNP sites (ranging from 0.035 to 0.752 SNP/ kbp; Supplementary Table S4b; see Supplementary text 1.4). Our comparison of phylogenies based on deterministic and probabilistic SNP identification showed consistent phylogenetic clustering for the samples that could be analysed using both methods ( Fig. 3; Supplementary Fig. S6; Supplementary text 1.4). Therefore, we based our analyses on the probabilistic approach that allowed us to (i) include significantly more samples and (ii) identify more SNP sites to infer more robust phylogenies (Supplementary Table S4).

Congruence of symbiont and mitochondrial phylogenies varied from high to absent
To examine partner fidelity in the O. algarvensis symbiosis, we compared the phylogenies of the six most widespread symbionts with that of their hosts' mitochondrial genomes. The degree of congruence between symbiont and host phylogenies reflects the degree of fidelity between partners, with high congruence indicating strong fidelity and vice versa [12,13,20]. For the host population, their mtDNA phylogeny revealed a clear divergence between two mitochondrial lineages (termed A-and B-hosts), corresponding to the two COI haplotypes A and B (Fig. 3a). In contrast, the two locations appeared to play a smaller role in shaping mitochondrial phylogeny, with only B-hosts from Sant' Andrea forming a well-supported clade. However, the phylogenetic relationships within the mitochondrial lineages could not be fully resolved due to their limited genetic divergence ( Supplementary Fig. S7). For the six symbionts of O. algarvensis, we found marked differences in the congruence of their phylogenies with that of their hosts' mitochondrial genomes (Fig. 3 b-g). Congruence was highest in Ca. Thiosymbion, which mirrored the mitochondrial phylogeny of their hosts, with symbionts from A-and B-hosts falling into two separate clades (Fig. 3b). The congruence between the two mitochondrial lineages and the two Ca. Thiosymbion clades was 100% at both locations. That is, O. algarvensis A-hosts always had Ca. Thiosymbion A, and O. algarvensis B-hosts always had Ca. Thiosymbion B, without any exceptions. The slight differences in clade topology in Fig. 3 a and b are due to the lack of sufficient genetic divergence between individual mitochondrial haplotypes and Ca. Thiosymbion genotypes ( Supplementary Fig. S7b). The Gamma3 symbionts formed nine groups, most of which were statistically supported clades, with each group containing symbionts from either A-or B-hosts (Fig. 3c). The exception was a Gamma3 symbiont of a B-host from Cavoli that fell in a clade of A-host symbionts from Cavoli (magnified panel in Fig. 3c). Location also appeared to affect the phylogeny of Gamma3 symbionts, with symbionts from the same bay forming distinct groups.
For all other symbionts besides the Ca. Thiosymbion and Gamma3, there was no phylogenetic divergence between symbionts from A-and B-hosts (Fig. 3 d-g). Location affected the phylogeny of the Delta4 symbionts, with two well-supported clades separating symbionts from Cavoli and Sant' Andrea (Fig. 3f ). The Delta1b symbionts also clustered based on their location, but the clades were not statistically supported (Fig. 3e). For the Delta1a and spirochete, neither host mitochondrial lineage nor location affected their phylogenetic clustering ( Fig. 3 d-e, g).
We further examined congruence between the phylogenies of O. algarvensis mitochondria and its symbionts using two additional approaches. First, we compared the pairwise genetic distances of hosts and symbionts using three categories: within A-or B-hosts from the same location (within), between A-and B-hosts (mito), and between the two locations Sant' Andrea and Cavoli (location). We tested if genetic distances were explained by host mitochondrial lineage or by location, by analysing pairwise genetic distances in 'within' vs. 'mito' and 'within' vs. 'location' (Fig. 4, Table 2, Supplementary Table  S5). For the host, both the mitochondrial lineage and the location had a significant effect on genetic distances, as observed in our phylogenetic SNP analyses. For the symbionts, there was a significant effect of the mitochondrial lineage on genetic distances in Ca. Thiosymbion and Gamma3 symbionts, while the effect of location was well supported for the Gamma3 and Delta4 symbionts, again confirming our phylogenetic analyses. An effect of location on genetic distances was also significant for the Delta1b symbionts, but only for B-host symbionts, as they were not detected in A-hosts from Sant' Andrea.
We next examined whether the genetic distance of mtDNA between pairs of O. algarvensis individuals corresponded to that of their symbionts. To test for this pattern suggesting genetic co-divergence, correlations of the pairwise genetic distances were calculated between mtDNA and each of the six symbionts ( Supplementary  Fig. S8, Supplementary Table S6a). The genetic distances of the Ca. Thiosymbion symbionts had the strongest positive correlation with mtDNA genetic distances ( Table 2, Supplementary Fig. S8a, Supplementary Table S6a). For the Gamma3, Delta1a, Delta1b and Delta4 symbionts, we only observed weak positive correlations ( Supplementary  Fig. S8 b-e). Genetic co-divergence between the spirochete symbionts and host mitochondria was not detectable ( Supplementary Fig. S8f, Supplementary Table S6a). Correlation coefficients (Mantel's R) calculated above for the six symbionts showed a positive correlation with the symbionts' relative read abundances ( Fig. 2b; Supplementary Table S6b). In other words, the higher the relative abundance of a symbiont species in host individuals, the greater the degree of genetic co-divergence with O. algarvensis.

Discussion
Our analyses revealed that fidelity between the gutless annelid O. algarvensis and its endosymbiotic microbial consortium varied from strict to absent. This variability in partner fidelity likely occurred over a short, microevolutionary period, as we analysed individuals within a population of O. algarvensis from two very closely related mitochondrial lineages (0.7% divergence) that co-occurred in two bays separated by only 16 km. Our study highlights the importance of examining fidelity over microevolutionary timescales, as it was central to revealing the broad range of fidelity across the members of the O. algarvensis symbiont community from strict over intermediate to absent. Over macroevolutionary timescales, strict fidelity is disrupted in Ca. Thiosymbion (see below), and it is unlikely that we would have detected the strong to moderate levels of fidelity in other members of the O. algarvensis symbiont community over longer evolutionary time.

Varying degrees of partner fidelity indicate a spectrum of mixed modes between vertical and horizontal transmission
Different degrees of partner fidelity across the microbial consortium of O. algarvensis reflect the faithfulness with which the symbionts are transmitted from one generation to the next. The symbionts of marine gutless annelids are transmitted vertically through egg smearing, during which the egg passes through symbiont-rich tissues termed genital pads [62][63][64]. The egg is then encased in a cocoon and deposited in the sediment. This process offers opportunities for horizontal transmission of bacteria from the surrounding sediment or cooccurring hosts. Indeed, previous studies have shown that over macroevolutionary time, host switching and displacement disrupt fidelity in Ca. Thiosymbion, based on comparative phylogenetic analyses of ribosomal genes in 23 gutless annelid species [65,66]. At microevolutionary time scales, however, our analyses of O. algarvensis revealed that transmission is strictly vertical for Ca. Thiosymbion, given the strong phylogenetic congruence and genetic co-divergence between these symbionts and their hosts' mitochondria, independent of their collection site (Supplementary text 1.6). We also observed strong fidelity in the Gamma3 symbiont, with only a single switching event from an A-to a B-host in Cavoli (Fig. 3c). Interestingly, such strong fidelity was not observed in the monospecific association between the marine clam Solemya velum and its vertically transmitted sulphur-oxidizing symbionts. In these clams, repeated uptake of symbionts from the environment or contemporary hosts occurs within single host populations [20]. This indicates a lesser degree of fidelity between S. velum and its symbionts than between O. algarvensis and its sulphur-oxidizing symbionts, despite the fact that the clam symbionts are likely transmitted via the ovaries versus the presumed less-restrictive mode of egg smearing in O. algarvensis. Partner fidelity was intermediate to absent for the deltaproteobacterial and spirochete symbionts of O. algarvensis. It is therefore likely that these symbionts are regularly acquired horizontally from a free-living population or other co-occurring host individuals. For the Gamma3, Delta1b and Delta4 symbionts, we also observed an effect of their geographic location on their genetic distances (Fig. 4, Table 2). Explanations for this effect include differences in the geographic distribution of genotypes for these symbionts in the environment, exchange of symbionts between contemporary hosts and isolation-by-distance effects on partner choice, but cannot be resolved without additional in-depth analyses of the free-living symbiont population (Supplementary text 1.6).
Combinations of vertical and horizontal transmission modes in co-occurring symbionts have been shown previously, for example in humans [42,43,45], sponges [46] and corals [47]. However, these observations were based on comparisons of microbiota in parents and their immediate offspring, and did not examine partner fidelity over multiple generations. As discussed above, time scales matter, because partner fidelity decreases over  [5,48]. In other words, for any given symbiont that is transmitted vertically across one or a few generations, over longer evolutionary time, horizontal transmission events are likely to disrupt partner fidelity [49,50]. To our knowledge, the wide range of vertical and horizontal transmission modes of the O. algarvensis co-occurring symbionts has not been previously shown at fine microevolutionary time scales.

How can we explain such different degrees of partner fidelity in the O. algarvensis symbiosis?
Strong partner fidelity is widespread in associations in which the symbionts are critical for the host's survival and fitness [2,84]. In O. algarvensis, the relative abundance of symbionts was positively correlated with fidelity, with the highest fidelity detected for their sulphur-oxidizing symbionts Ca. Thiosymbion, followed by Gamma3 (42% and 28% relative read abundance, respectively; Fig. 2b). These symbionts are the primary producers for their gutless hosts by autotrophically fixing CO 2 into organic compounds [52,54,61], and all O. algarvensis individuals from both bays harboured these symbionts. Given that O. algarvensis gains all of its nutrition by digesting its endosymbionts via endocytosis [70,71], those symbionts that provide most of its nutrition are likely most strongly selected for, as they most directly affect the fitness of their host.
In contrast, selective pressures on maintaining the symbiotic association are likely relaxed for symbionts that are less critical for their host's fitness or are functionally redundant. The deltaproteobacterial, sulphate-reducing symbionts play an important role by producing reduced sulphur compounds as energy sources for the sulphuroxidizing bacteria, particularly when these compounds are limiting in the sediment environment [52]. However, all four deltaproteobacterial symbionts reduce sulphate to sulphide [52-54, 68, 69], making this trait functionally redundant among these symbionts. Correspondingly, the presence and abundance of the four sulphate-reducing symbionts varied across host individuals, with different combinations of one to three of these symbionts in each host. This pick and mix pattern indicates that O. algarvensis fitness is not dependent on any particular deltaproteobacterial symbiont species but on the presence of functional roles that they share. The lower relative abundance of reads from deltaproteobacterial symbionts (summed average 27%) compared to those of the sulphur oxidizers (70%), as well as their limited biomass [54,55], further indicates that the deltaproteobacterial symbionts are not as critical for their host's nutrition as Ca. Thiosymbion and Gamma3.
The absence of fidelity in the spirochete symbionts was unexpected, given their regular presence in O. algarvensis individuals and in other gutless annelids from around the world [56,85] (Supplementary Fig. S2). While the metabolism of this symbiont is not yet known, it was by far the least abundant symbiont in O. algarvensis in this study (3% relative read abundance) as well as previous ones [53-55, 67, 70], indicating a limited role of these symbionts in their hosts' nutrition.
In addition to selective forces, less stringent fidelity in low-abundance symbionts could also be caused by stochastic processes, for example poorer chances of faithful vertical transmission during egg smearing. Assuming that the success with which a symbiont is transmitted to the egg depends on its abundance in the parent, symbionts present in low abundance might face greater chances of not being transmitted than high abundance symbionts. Horizontal reacquisition of these symbionts from the sediment or co-occurring hosts would then explain the disruption of genetic congruence between these symbionts and O. algarvensis.

Having your cake and eating it too: the advantage of flexibility in partner fidelity
The lack of stringent partner fidelity between O. algarvensis and its deltaproteobacterial and spirochete symbionts indicates that these hosts regularly acquire novel intraspecific genotypes from the environment or from co-occurring host individuals. These newly acquired intraspecific genotypes, together with the interspecific variability in the deltaproteobacterial community across host individuals, could expand the ecological niche of O. algarvensis and enable the population to adapt to fluctuating environments. For example, new symbiont genotypes could be better adapted to the environment, provide greater flexibility in the use of resources from the environment and enable resilience to seasonal and long-term temperature changes [5,[86][87][88]. On the other hand, weak partner fidelity can be costly for the host, including the failure to find suitable symbionts, acquisition of harmful bacteria, or association with 'cheaters' that do not provide mutualistic benefits [89]. However, these costs are likely to be minimal for the deltaproteobacterial symbionts, as their partner quality may depend largely on their ability to produce reduced sulphur compounds, a trait that is critically linked to the energy metabolism of these sulphate-reducing bacteria [52][53][54], making cheating in this trait unviable for the sulphate-reducing bacteria.
For O. algarvensis, fluctuating environmental conditions may be more critical selective forces than the costs of weak partner fidelity. O. algarvensis, like other marine annelids, does not have a pelagic life stage and can likely not disperse as widely as many other infaunal invertebrates [63]. These hosts therefore face the risk of local extinction if environmental conditions become unsuitable for their symbionts. Free-living bacteria rapidly adapt to new environmental conditions [90], and horizontal acquisition of symbionts from the environment can increase the host's potential to adapt to environmental challenges rapidly [91]. The benefit of long-term survival via an evolutionary bet hedging [92,93] may therefore outweigh the cost of recruiting lessfavourable symbionts, as predicted in a recent theoretical study modelling the selective advantages of imperfect vertical transmission of symbionts in variable environments [72].

Probabilistic approaches to SNP identification expand ecological and evolutionary studies of microbial communities
Metagenomic analyses of intraspecific genetic variability in microbial communities typically rely on deep sequencing of each genotype to obtain sufficient SNPs across their genomes [77], incurring substantial sequencing costs. In this study, we reconstructed phylogenies of symbionts directly from genotype probabilities across their genomes, rather than calling genotypes. This probabilistic approach accurately identifies SNPs from low-coverage sequencing data by accounting for the uncertainties of genotyping [73]. This approach has several advantages, including (i) lower sequencing costs compared to highcoverage sequencing, meaning that for the same price, better population coverage can be achieved by sequencing more individuals, (ii) the ability to analyse the genetic diversity of low-abundance symbionts that are present in only some hosts and (iii) an increased robustness of phylogenetic analyses through the recovery of higher SNP numbers. Our study highlights how approaches using genotype probabilities can be applied to the population genetics of host-associated microbiota with variable abundances. Furthermore, our results indicate that probabilistic approaches can also be used to study evolutionary dynamics in free-living microbial communities, thus greatly expanding our toolbox for understanding noncultivable microorganisms.

Conclusions
We showed that partner fidelity varies from strict to absent in the association between the gutless marine annelid O. algarvensis and its microbial consortium. This variability in fidelity was unexpected given that these hosts transmit their symbionts vertically via egg smearing. Our results highlight the importance of examining partner fidelity at microevolutionary scales, as over longer evolutionary time, strict vertical transmission is rare in most symbioses [48,88]. Understanding the processes that drive fidelity within associations over short to long evolutionary time will help identify the benefits and costs in maintaining symbiotic associations. Such efforts should encompass increasing geographic and taxonomic scales, beginning with local host populations and expanding to regional intraspecific analyses to large-scale global analyses across host species (e.g., [39,44,94,95]). Rapid advances in high-throughput sequencing combined with substantial reductions in sequencing costs using probabilistic SNP calling now make such studies feasible and will contribute to revealing the driving forces that shape the complex and fluid nature of multimember symbioses [87,[96][97][98].

Metagenome sequencing
Sequencing libraries were constructed from the DNA extracted from single worms using a Tn5 transposase purification and tagmentation protocol [103]. The Tn5 transposase was provided by the Protein Science Facility at Karolinska Institutet SciLifeLab (Solna, Sweden). Quantity and quality of DNA samples were checked with the Quantus Fluorometer with the QuantiFluor dsDNA System (Promega Corporation, Madison, WI, USA), the Agilent TapeStation system with the DNA ScreenTape (Agilent Technologies, Santa Clara, CA, USA) and the FEMTO Pulse genomic DNA analysis kit (Advanced Analytical Technologies Inc., Heidelberg, Germany) prior to library construction. Insert template DNA was sizeselected for 400-500 bp using the AMPure XP (Beckman Coulter, Indianapolis, IN, USA). Paired-end 150-bp sequences were generated using the Illumina HiSeq3000 System (Illumina, San Diego, CA, USA) with an average total yield of 2.5 Gbp per sample (2548 ± 715 Mbp (mean ± SD)). Construction and quality control of libraries and sequencing were performed at the Max Planck Genome Centre (Cologne, Germany).
Complete mitochondrial genomes (mtDNA) were assembled from two metagenomes of O. algarvensis from Sant' Andrea representing the two COI haplotypes A and B (specimen IDs: 'OalgSANT_A04' and 'Oalg-SANT_B04, ' respectively). Duplicated sequences due to PCR-amplification during the library preparation were first removed from raw sequences using FastUniq v1.1 [112] prior to adapter clipping and quality trimming with Phred quality score ≥ 2 using Trimmomatic v0.36 [113]. A preliminary mtDNA scaffold was first generated by iterative mapping of the clean reads to a reference COI sequence of O. algarvensis (NCBI accession number KP943854, as a bait sequence) with MITObim v1.9 [114]. Mitochondrial reads were then identified by mapping to the mtDNA scaffold using bbmap in BBTools, and mtDNA was assembled from the identified reads with SPAdes. A circular mtDNA was identified on assembly graphs in Bandage and annotated using MITOS2 webserver (http:// mitos2. bioinf. uni-leipz ig. de) to confirm the completeness [115].

Characterization of symbiont community composition
The taxonomic composition of the O. algarvensis metagenomes was first screened by identifying 16S rRNA gene sequences using phyloFlash v3.3-beta1 [116] with the SILVA SSU database release 132 [117] as reference. The 16S rRNA gene sequences of O. algarvensis symbionts were assembled with SPAdes implemented in phylo-Flash and aligned using MAFFT in Geneious to identify SNP sites. Chimeric and incomplete (< 1100 bp) SSU sequences were identified in the alignment and excluded.
Relative abundances of O. algarvensis symbionts were estimated by mapping metagenome reads to a collection of symbiont-specific sequences of single-copy genes extracted from the genome bins. Orthologous singlecopy gene sequences were first identified within each of the reference symbiont MAGs with CheckM. To ensure unambiguous taxon differentiation, duplicated genes detected in each symbiont MAG (i.e. those labelled as 'contamination' in CheckM) as well as sequences sharing > 90% nucleotide identity between multiple symbiont species (checked with CD-HIT v4.5.4 [118]; 8 cases identified between the Delta1a and Delta1b symbionts) were removed from the final reference sequences of single-copy genes. Metagenomic reads (quality filtered with the same processes as for the mtDNA assembly above) matching to the singe-copy gene sequences were quantified using Kallisto v0.44.0 [119] (Supplementary text 1.1). Symbiont composition estimates were plotted in R using ggplot2 package v3.2.1 [120].
To examine whether our phylogenetic analysis of each O. algarvensis symbiont reflects a single-dominant genotype per host individual, levels of symbiont genotype diversity within a host individual were assessed by calculating SNP densities (the number of SNP sites per kbp of reference genome) with previously established procedures [82]. This analysis was performed using a set of publicly available deeply sequenced metagenomes of O. algarvensis (listed in Supplementary Table S2a) to ensure that SNP densities were estimated using sufficient symbiont read coverages, and that these estimates could be compared to those in other studies performing similar assessments [20,[80][81][82].

Identification of single-nucleotide polymorphisms and phylogenetic reconstruction
For identification of SNPs in the genomes of symbionts and mitochondria, quality-controlled reads were first unambiguously split into different symbiont species using bbsplit of BBTools, using the reference symbiont MAGs described above. Mitochondrial reads were identified with the reference mtDNA sequence derived from the specimen 'OalgSANT_A04' (Sant' Andrea, COI haplotype A) using bbmap with a minimum nucleotide identity of 95%; this step was performed to remove potential contaminations from sequences of nuclear mitochondrial pseudogenes divergent from the mtDNA [121] while ensuring successful mapping of mtDNA reads in the analysis for both A-and B-hosts given the high nucleotide identity of mtDNAs between these two lineages (> 99%; see the "Results").
To reconstruct phylogenies of symbionts and mitochondria, SNPs in genomes of symbionts and host mitochondria were identified using two approaches: based on (i) posterior genotype probabilities without genotype calling and (ii) deterministic genotyping only at genetic positions that were deeply sequenced in all samples. For the SNP identification without genotyping (i), the symbiont and mitochondrial reads identified above were mapped onto individual reference genomes of symbionts and mitochondria using bbmap. Mapping files were filtered based on mapping quality using samtools v1.3.1 [122] and BamUtil v1.0.14 [123], deduplicated with MarkDuplicates of Picard Toolkit v2.9.2 (https:// github. com/ broad insti tute/ picard) and realigned around indels with Genome Analysis Toolkit v3.7 [124]. Posterior genotype probabilities were calculated with ANGSD v0.929 [125]. ANGSD is widely used for studies of diploid organisms to infer genotypes from low-coverage sequencing data while taking sequencing errors into account [125][126][127]. To deal with haploid genotypes in ANGSD, all genotypes were assumed to be 'homozygous' by setting an inbreeding coefficient F of '1' , and a uniform prior were specified for posterior probability calculation (T.S. Korneliussen, 2018, pers. comm.). SNP sites were identified as reference nucleotide positions that were covered ≥ 1 × by all samples and showed statistically significant support (SNP p-value < 0.01). When no SNP site was found due to very low or lacking reads from a symbiont, these samples were excluded based on a cut-off of lateral coverages (i.e. % reference genetic sites covered by reads; Supplementary Table S4). For symbionts and mitochondria, their pairwise genetic distances in host individuals were obtained from the matrix of genotype probabilities using NGSdist v1.0.2 [74]. Phylogenetic trees with bootstrap support were computed from the resulting distant matrix using FastME v2.1.5.1 [128] and RAxML v8.2.11 [129].
For the SNP identification by deterministic genotyping (ii), the same symbiont and mitochondrial reads were analysed with the SNIPPY pipeline v3.2 (https:// github. com/ tseem ann/ snippy), with the same reference genomes of mtDNA and symbionts as above (Supplementary text 1.4).

Analyses of phylogenetic patterns of symbionts based on host mitochondrial lineages and locations
To examine which factors drive the patterns of genetic divergence in mitochondria and symbionts, their pairwise genetic distances were statistically compared in 3 categories of host pairs: (a) within the same combination of location plus A-or B-hosts, (b) between A-and B-hosts (A vs. B in Cavoli and A vs. B in Sant' Andrea) and (c) between locations (Cavoli vs. Sant' Andrea in A-hosts and Cavoli vs. Sant' Andrea in B-hosts). Sample pairs in each category were randomly selected without replacement to ensure data independence. For the Delta1b symbionts, we separately compared (a) vs. (b) within Cavoli and (a) vs. (c) within B-hosts, because Delta1b did not occur in A-hosts in Sant' Andrea. The statistical analyses were performed for each of the symbionts and mitochondria using Kruskal-Wallis rank sum test and Dunn post hoc tests with p-value adjustment by controlling the false-discovery rate, using R core package v3.4.2 [130] and FSA package v0.8.25 [131]. To examine genetic co-divergence patterns between a symbiont and host mtDNA, regressions of pairwise genetic distances estimated with NGSdist were examined using Mantel tests implemented in the R-package vegan v2.4.4 [132]. A correlation between Mantel's Rs and relative abundances of symbionts based on reads mapped to single-copy genes was tested with Kendall's rank correlation tau using the function test.cor in R's core package.